LICRE: unsupervised feature correlation reduction for lipidomics
نویسندگان
چکیده
MOTIVATION Recent advances in high-throughput lipid profiling by liquid chromatography electrospray ionization tandem mass spectrometry (LC-ESI-MS/MS) have made it possible to quantify hundreds of individual molecular lipid species (e.g. fatty acyls, glycerolipids, glycerophospholipids, sphingolipids) in a single experimental run for hundreds of samples. This enables the lipidome of large cohorts of subjects to be profiled to identify lipid biomarkers significantly associated with disease risk, progression and treatment response. Clinically, these lipid biomarkers can be used to construct classification models for the purpose of disease screening or diagnosis. However, the inclusion of a large number of highly correlated biomarkers within a model may reduce classification performance, unnecessarily inflate associated costs of a diagnosis or a screen and reduce the feasibility of clinical translation. An unsupervised feature reduction approach can reduce feature redundancy in lipidomic biomarkers by limiting the number of highly correlated lipids while retaining informative features to achieve good classification performance for various clinical outcomes. Good predictive models based on a reduced number of biomarkers are also more cost effective and feasible from a clinical translation perspective. RESULTS The application of LICRE to various lipidomic datasets in diabetes and cardiovascular disease demonstrated superior discrimination in terms of the area under the receiver operator characteristic curve while using fewer lipid markers when predicting various clinical outcomes. AVAILABILITY AND IMPLEMENTATION The MATLAB implementation of LICRE is available from http://ww2.cs.mu.oz.au/∼gwong/LICRE
منابع مشابه
EANN2014: Unsupervised Feature Selection for Sensor Time-series in Pervasive Computing Applications
The paper introduces an efficient feature selection approach for multivariate time-series of heterogeneous sensor data within a pervasive computing scenario. An iterative filtering procedure is devised to reduce information redundancy measured in terms of time-series cross-correlation. The algorithm is capable of identifying non-redundant sensor sources in an unsupervised fashion even in presen...
متن کاملAnalysis of Correlation Based Dimension Reduction Methods
Dimension reduction is an important topic in data mining and machine learning. Especially dimension reduction combined with feature fusion is an effective preprocessing step when the data are described by multiple feature sets. Canonical Correlation Analysis (CCA) and Discriminative Canonical Correlation Analysis (DCCA) are feature fusion methods based on correlation. However, they are differen...
متن کاملGraph Autoencoder-Based Unsupervised Feature Selection with Broad and Local Data Structure Preservation
Feature selection is a dimensionality reduction technique that selects a subset of representative features from highdimensional data by eliminating irrelevant and redundant features. Recently, feature selection combined with sparse learning has attracted significant attention due to its outstanding performance compared with traditional feature selection methods that ignores correlation between ...
متن کاملکاهش ابعاد دادههای ابرطیفی به منظور افزایش جداییپذیری کلاسها و حفظ ساختار داده
Hyperspectral imaging with gathering hundreds spectral bands from the surface of the Earth allows us to separate materials with similar spectrum. Hyperspectral images can be used in many applications such as land chemical and physical parameter estimation, classification, target detection, unmixing, and so on. Among these applications, classification is especially interested. A hyperspectral im...
متن کاملWhen Low Rank Representation Based Hyperspectral Imagery Classification Meets Segmented Stacked Denoising Auto-Encoder Based Spatial-Spectral Feature
When confronted with limited labelled samples, most studies adopt an unsupervised feature learning scheme and incorporate the extracted features into a traditional classifier (e.g., support vector machine, SVM) to deal with hyperspectral imagery classification. However, these methods have limitations in generalizing well in challenging cases due to the limited representative capacity of the sha...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Bioinformatics
دوره 30 19 شماره
صفحات -
تاریخ انتشار 2014